Self-calibrating Neural-Probabilistic Model for Authorship Verification Under Covariate Shift

Authors

Abstract

We address two fundamental problems in authorship verification (AV): topic variability and miscalibration. Variations in the topic of disputed texts are a major cause of error for most AV systems. In addition, it is observed that the underlying probability estimates produced by deep learning mechanisms oftentimes do not match the actual case counts in the respective training data. As such, these estimates are poorly calibrated. We expand our framework from PAN 2020 to include Bayes factor scoring (BFS) and an uncertainty adaptation layer (UAL) to address both problems. Experiments with the 2020/21 shared task data show that the proposed method significantly reduces sensitivity to topical variations and improves the system's calibration.
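To make the notion of miscalibration concrete, the following minimal sketch computes a binned calibration gap for a binary verifier: within each probability bin, it compares the mean predicted same-author probability with the observed fraction of same-author pairs. The function name `expected_calibration_error`, the equal-width binning, and the toy data are illustrative assumptions; the abstract does not state which calibration metrics the paper reports.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binned calibration gap for binary predictions (illustrative sketch).

    probs  : predicted probabilities of the positive class, each in [0, 1]
    labels : ground-truth labels, each 0 or 1
    Returns the bin-size-weighted mean of |mean predicted prob - observed rate|.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        raise ValueError("at least one prediction is required")

    # Assign every prediction to an equal-width bin over [0, 1].
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes to the last bin
        bins[idx].append((p, y))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_prob = sum(p for p, _ in members) / len(members)
        observed_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(mean_prob - observed_rate)
    return ece


# Hypothetical toy data: ten pairs all scored 0.9.
well_calibrated = expected_calibration_error([0.9] * 10, [1] * 9 + [0])
over_confident = expected_calibration_error([0.9] * 10, [1] * 5 + [0] * 5)
```

In this toy case the first call yields a gap close to zero, because nine of the ten pairs scored 0.9 are in fact positive, while the second yields a gap close to 0.4, because only half of them are.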


Similar resources

Model Selection Under Covariate Shift

A common assumption in supervised learning is that the training and test input points follow the same probability distribution. However, this assumption is not fulfilled, e.g., in interpolation, extrapolation, or active learning scenarios. The violation of this assumption, known as the covariate shift, causes a heavy bias in standard generalization error estimation schemes such as cross-validati...


Probabilistic Anomaly Detection Method for Authorship Verification

Authorship verification is the task of determining if a given text is written by a candidate author or not. In this paper, we present a first study on using an anomaly detection method for the authorship verification task. We have considered a weakly supervised probabilistic model based on a multivariate Gaussian distribution. To evaluate the effectiveness of the proposed method, we conducted e...


Discriminative Learning Under Covariate Shift

We address classification problems for which the training instances are governed by an input distribution that is allowed to differ arbitrarily from the test distribution—problems also referred to as classification under covariate shift. We derive a solution that is purely discriminative: neither training nor test distribution are modeled explicitly. The problem of learning under covariate shif...


Distance Metric Learning under Covariate Shift

Learning distance metrics is a fundamental problem in machine learning. Previous distance-metric learning research assumes that the training and test data are drawn from the same distribution, which may be violated in practical applications. When the distributions differ, a situation referred to as covariate shift, the metric learned from training data may not work well on the test data. In thi...


Semi-supervised speaker identification under covariate shift

In this paper, we propose a novel semi-supervised speaker identification method that can alleviate the influence of non-stationarity such as session dependent variation, the recording environment change, and physical conditions/emotions. We assume that the voice quality variants follow the covariate shift model, where only the voice feature distribution changes in the training and test phases. ...



Journal

Journal title: Lecture Notes in Computer Science

Year: 2021

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-030-85251-1_12